Exploiting weakly supervised visual patterns to learn from partial annotations
As classification datasets grow progressively larger in both label space and number of examples, annotating them with all labels becomes a non-trivial and expensive task. For example, annotating the entire OpenImages test set could cost $6.5M. Hence, in current large-scale benchmarks such as OpenImages and LVIS, fewer than 1\% of the labels are annotated across all images. Standard classification models are trained in a manner that simply ignores these un-annotated labels. Ignoring the un-annotated labels discards supervisory signal, which reduces the performance of the classification models. Instead, in this paper, we exploit relationships among images and labels to derive additional supervisory signal from the un-annotated labels. We study the effectiveness of our approach across several multi-label computer vision benchmarks: CIFAR100, MS-COCO panoptic segmentation, OpenImages, and LVIS. Our approach outperforms baselines by 2-10% across all datasets on mean average precision (mAP) and mean F1 metrics.
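To make concrete what "ignoring un-annotated labels" means in training, here is a minimal sketch of a masked binary cross-entropy loss, the standard recipe for partially annotated multi-label data. The function name `masked_bce` and the exact masking scheme are illustrative assumptions, not the paper's implementation; the point is that labels with `mask == 0` contribute no gradient at all, which is the lost supervisory signal the abstract refers to.

```python
import numpy as np

def masked_bce(logits, labels, mask):
    """Binary cross-entropy over labels, skipping un-annotated entries.

    labels: 1.0 for an annotated positive, 0.0 for an annotated negative.
    mask:   1.0 where the label was annotated, 0.0 where it is missing.
    Entries with mask == 0 are zeroed out, so missing labels supply
    no supervisory signal (illustrative sketch, not the paper's code).
    """
    p = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))  # sigmoid
    eps = 1e-7  # numerical floor to avoid log(0)
    per_label = -(labels * np.log(p + eps) + (1.0 - labels) * np.log(1.0 - p + eps))
    annotated = max(mask.sum(), 1.0)  # average only over annotated labels
    return float((per_label * mask).sum() / annotated)
```

Because masked entries are multiplied by zero, flipping the label of an un-annotated class leaves the loss unchanged, which is exactly why that annotation gap translates into weaker supervision.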
Review for NeurIPS paper: Exploiting weakly supervised visual patterns to learn from partial annotations
Weaknesses: - the method itself is pretty straightforward, but I wonder a bit at some choices. In particular, so much effort goes into computing many different distances (image-level/label-level and positive/negative), but almost all of this information is discarded because only the min value across all of these is used to compute the final temperature term. It just seems at odds with a "soft" penalty to add so many "hard" operations to throw away other potentially relevant information. Why not do some weighted combination? Were any other alternatives explored?
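The reviewer's suggestion of a weighted combination can be illustrated with a softmin: instead of keeping only the single smallest distance, every distance contributes with a weight that decays with its magnitude, and a sharpness parameter recovers the hard min in the limit. This is a sketch of the alternative being proposed, not the paper's method; `beta` and the function names are hypothetical.

```python
import numpy as np

def hard_min_temperature(distances):
    """The hard choice: keep only the minimum distance, discard the rest."""
    return float(np.min(distances))

def soft_min_temperature(distances, beta=5.0):
    """Softmin: a weighted average where smaller distances get larger weight.

    As beta grows, the weights concentrate on the smallest distance and
    the result approaches hard_min_temperature (illustrative sketch).
    """
    d = np.asarray(distances, dtype=float)
    w = np.exp(-beta * (d - d.min()))  # shift by min for numerical stability
    w /= w.sum()
    return float(np.dot(w, d))
```

A combination like this keeps the "soft" character of the penalty end to end: the other image-level/label-level and positive/negative distances still influence the temperature term rather than being thrown away by the min.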